Abstract

We focus on a permutation betting market under 
parimutuel call auction model where traders bet on 
the final ranking of n candidates. We present a Pro- 
portional Betting mechanism for this market. Our 
mechanism allows the traders to bet on any subset 
of the $n^2$ 'candidate-rank' pairs, and rewards them
proportionally to the number of pairs that appear in 
the final outcome. We show that the market organizer's
decision problem for this mechanism can be formu- 
lated as a convex program of polynomial size. More 
importantly, the formulation yields a set of $n^2$ unique
marginal prices that are sufficient to price the bets in 
this mechanism, and are computable in polynomial- 
time. The marginal prices reflect the traders' be- 
liefs about the marginal distributions over outcomes. 
We also propose techniques to compute the joint distribution
over $n!$ permutations from these marginal
distributions. We show that using a maximum en- 
tropy criterion, we can obtain a concise parametric 
form (with only $n^2$ parameters) for the joint distribution
which is defined over an exponentially large
state space. We then present an approximation al- 
gorithm for computing the parameters of this distri- 
bution. In fact, the algorithm addresses the generic 
problem of finding the maximum entropy distribu- 
tion over permutations that has a given mean, and 
may be of independent interest. 

1 Introduction 

Prediction markets are increasingly used as an 
information aggregation device in academic re- 
search and public policy discussions. The fact 
that traders must "put their money where their
mouth is" when they say things via markets 
helps to collect information. To take full ad- 
vantage of this feature, however, we should ask 
markets the questions that would most inform 
our decisions, and encourage traders to say as 
many kinds of things as possible, so that a big 
picture can emerge from many pieces. Combina- 
torial betting markets hold great promise on this 
front. Here, the prices of contracts tied to the 
events have been shown to reflect the traders' 
belief about the probability of events. Thus, the 
pricing or ranking of possible outcomes in a com- 
binatorial market is an important research topic. 

We consider a permutation betting scenario
where traders submit bids on the final ranking of $n$
candidates, for example, in an election or a horse
race. The possible outcomes are the $n!$ possible
orderings among the candidates, and hence
there are $2^{n!}$ subsets of events to bid on. In order
to aggregate information about the probability 
distribution over the entire outcome space, one 
would like to allow bets on all these event com- 
binations. However, such betting mechanisms 
are not only intractable, but also exacerbate the 
thin market problems by dividing participants'
attention among an exponential number of out- 
comes 0, EH. Thus, there is a need for betting 
languages or mechanisms that could restrict the 
possible bid types to a tractable subset and at 
the same time provide substantial information 
about the traders' beliefs. 

1.1 Previous Work 

Previous work on parimutuel combinatorial mar- 
kets can be categorized under two types of mech- 
anisms: a) posted price mechanisms including 
the Logarithmic Market Scoring Rule (LMSR) 
of Hanson [ll], QiJ and the Dynamic Pari-mutuel 
Market-Maker (DPM) of Pennock [13] b) call 
auction models developed by Lange and Econo- 
mides [131 ] . Peters et al. 171 ] . in which all the 
orders are collected and processed together at
once. An extension of the call auction mecha- 
nism to a dynamic setting similar to the posted 
price mechanisms, and a comparison between 
these models can be found in Peters et al. 

Chen et al. (2008) [4] analyze the computa-
tional complexity of market maker pricing algo- 
rithms for combinatorial prediction markets un- 
der LMSR model. They examine both permu- 
tation combinatorics, where outcomes are per- 
mutations of objects, and Boolean combina- 
torics, where outcomes are combinations of bi- 
nary events. Even with severely limited lan- 
guages, they find that LMSR pricing is #P- 
hard, even when the same language admits 
polynomial-time matching without the market 
maker. Chen, Goel, and Pennock 0] study a spe- 
cial case of Boolean combinatorics and provide a 
polynomial-time algorithm for LMSR pricing in 
this setting based on a Bayesian network repre- 
sentation of prices. They also show that LMSR 
pricing is NP-hard for a more general bidding 
language. 

More closely related to our work are the stud- 
ies by Fortnow et al. and Chen et al. (2006) 
[H] on call auction combinatorial betting mar- 
kets. Fortnow et al. [1] study the computational 
complexity of finding acceptable trades among a 
set of bids in a Boolean combinatorial market. 
Chen et al. (2006) [1] analyze the auctioneer's 
matching problem for betting on permutations, 
examining two bidding languages. Subset bets
are bets of the form "candidate i finishes in positions
x, y, or z" or "candidate i, j, or k finishes in
position x". Pair bets are of the form "candidate i
beats candidate j". They give a polynomial-time
algorithm for matching divisible subset bets, but 
show that matching pair bets is NP-hard. 

1.2 Our Contribution 

This paper extends the above-mentioned work in 
a variety of ways. We propose a more generalized
betting language called Proportional Betting
that encompasses Subset Betting [5] as a
special case. Further, we believe that ours is the 
first result on pricing a parimutuel call auction 
under a permutation betting scenario.

In our proportional betting mechanism, the
traders bet on one or more of the $n^2$ 'candidate-
position' pairs, and receive rewards proportional
to the number of pairs that appear in the final
outcome. For example, a trader may place an
order of the form "Horse A will finish in position
2 OR Horse B will finish in position 4". He¹
will receive a reward of $2 if both Horse A and
Horse B finish at the specified positions 2 and 4
respectively; and a reward of $1 if only one horse 
finishes at the position specified. The market 
organizer collects all the orders and then decides 
which orders to accept in order to maximize his 
worst case profit. 
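
To make the payout rule concrete, the following minimal sketch computes the proportional reward for the order in the example above. The function name `proportional_payoff`, the dictionary encoding of outcomes, and the four-horse field are illustrative assumptions, not part of the mechanism's definition.

```python
def proportional_payoff(bid_pairs, outcome):
    """Reward in dollars: $1 per (candidate, position) pair that appears in the outcome.

    bid_pairs: set of (candidate, position) pairs the trader bet on.
    outcome:   dict mapping each candidate to its final position.
    """
    return sum(1 for candidate, position in bid_pairs
               if outcome.get(candidate) == position)

# Order from the text: "Horse A in position 2 OR Horse B in position 4".
bid = {("A", 2), ("B", 4)}

both_hit = {"C": 1, "A": 2, "D": 3, "B": 4}   # A and B both finish as specified
one_hit = {"C": 1, "A": 2, "B": 3, "D": 4}    # only A finishes as specified
no_hit = {"A": 1, "B": 2, "C": 3, "D": 4}     # neither finishes as specified

print(proportional_payoff(bid, both_hit))  # 2
print(proportional_payoff(bid, one_hit))   # 1
print(proportional_payoff(bid, no_hit))    # 0
```

The reward is simply a count of matching pairs, which is what later allows the organizer's problem to be written with a linear payoff term.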

In particular, we will present the following re- 
sults: 

• We show that the market organizer's de- 
cision problem for this mechanism can be 
formulated as a convex program with only 
$O(n^2 + m)$ variables and constraints, where
m is the number of bidders. 

• We show that we can obtain, in polynomial- 
time, a small set ($n^2$) of 'marginal prices'
that satisfy the desired price consistency 
constraints, and are sufficient to price the 
bets in this mechanism. 

• We show that by introducing non-zero 
starting orders, our mechanism will produce 
unique marginal prices. 

• We suggest a maximum entropy criterion to
obtain a maximum-entropy joint distribution
over the $n!$ outcomes from the marginal
prices. Although defined over an exponen- 
tial space, this distribution has a concise 
parametric form involving only $n^2$ parameters.
Moreover, it is shown to agree with
the maximum-likelihood distribution when 
prices are interpreted as observed statistics 
from the traders' beliefs. 

• We present an approximation algorithm to 
compute the parameters of the maximum 
entropy joint distribution to any given accuracy
in (pseudo)-polynomial time². In fact,
this algorithm can be directly applied to the 
generic problem of finding the maximum en- 
tropy distribution over permutations that 
has a given expected value, and may be of 
independent interest. 

¹ 'he' shall stand for 'he or she'

² The approximation factors and running time will be
established precisely in the text.
2 Background

In this section, we briefly describe the Convex 
Parimutuel Call Auction Model (CPCAM) developed
by Peters et al. [17] that will form the
basis of our betting mechanism. 

Consider a market with one organizer and m 
traders or bidders. There are S states of the 
world in the future on which the traders are sub- 
mitting bids. For each bid that is accepted by 
the organizer and contains the realized future 
state, the organizer will pay the bidder some 
fixed amount of money, which is assumed to be 
$1 without loss of generality. The organizer col- 
lects all the bids and decides which bids to ac- 
cept in order to maximize his worst case profit. 

Let $a_{ik} \in \{0, 1\}$ denote trader $k$'s bid for
state $i$. Let $q_k$ and $\pi_k$ denote the limit quantity
and limit price for trader $k$, i.e., trader $k$'s maximum
number of orders requested and maximum
price for the bid, respectively. The number of
orders accepted for trader $k$ is denoted by $x_k$,
and $p_i$ denotes the price computed for outcome
state $i$. $x_k$ is allowed to take fractional values,
that is, the orders are 'divisible' in the terminology
of [5]. Below is the convex formulation of
the market organizer's problem given by [13]:

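$$
\begin{aligned}
\max_{x,s,r} \quad & \pi^T x - r + \mu \sum_{i=1}^{S} \theta_i \log(s_i) \\
\text{s.t.} \quad & \sum_k a_{ik} x_k + s_i = r, \quad 1 \le i \le S \\
& 0 \le x \le q \\
& s \ge 0
\end{aligned}
\tag{1}
$$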

The 'parimutuel' price vector $\{p_i\}_{i=1}^{S}$ is given
by the dual variables associated with the first
set of constraints. The parimutuel property implies
that when the bidders are charged a price
of $\sum_i a_{ik} p_i$, instead of their limit price, the
payouts made to the bidders are exactly funded
by the money collected from the accepted orders
in the worst-case outcome. $\theta > 0$ represents
starting orders needed to guarantee uniqueness
of these state prices in the solution, and $\mu > 0$ is the
weight given to the starting order term.

The significance of starting orders needs a special
mention here. Without the starting orders,
(1) would be a linear program with multiple dual
solutions. Introducing the convex barrier term
involving $\theta$ makes the dual strictly convex, resulting
in a unique optimal price vector. To understand
its effect on the computed prices, consider
the dual problem for (1):

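$$
\begin{aligned}
\min_{y,p} \quad & q^T y - \mu \sum_{i=1}^{S} \theta_i \log(p_i) \\
\text{s.t.} \quad & \sum_i p_i = 1 \\
& \sum_i a_{ik} p_i + y_k \ge \pi_k \quad \forall k \\
& y \ge 0
\end{aligned}
$$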

Observe that if $\theta$ is normalized, the second term
in the objective gives the K-L distance³ of $\theta$ from
$p$ (less a constant term $\sum_i \theta_i \log \theta_i$). Thus, when
$\mu$ is small, the above program optimizes the first
term $q^T y$, and among all these optimal price vectors
picks the one that minimizes the K-L distance
of $p$ from $\theta$. As discussed in the introduction,
the price vectors are of special interest due
to their interpretation as outcome distributions.
Thus, the starting orders enable us to choose the
unique distribution $p$ that is closest (minimum
K-L distance) to a prior specified through $\theta$.
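
The K-L interpretation of the barrier term can be checked in one line. The following worked identity is a sketch that assumes $\theta$ is normalized ($\sum_i \theta_i = 1$) and writes $\mathrm{KL}(\theta \,\|\, p)$ for the quantity $\sum_i \theta_i \log(\theta_i / p_i)$; the $\mathrm{KL}(\cdot\|\cdot)$ shorthand is introduced here only for illustration.

```latex
\[
-\mu \sum_i \theta_i \log p_i
  \;=\; \mu \sum_i \theta_i \log\frac{\theta_i}{p_i} \;-\; \mu \sum_i \theta_i \log \theta_i
  \;=\; \mu\, \mathrm{KL}(\theta \,\|\, p) \;-\; \mu \sum_i \theta_i \log \theta_i .
\]
```

Since the last sum does not depend on $p$, minimizing the second term of the dual objective over $p$ is the same as minimizing the K-L term.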

The CPCAM model shares many desirable 
properties with the limit order parimutuel call 
auction model originally developed by Lange 
and Economides [la ]. Some of its important 
properties from information aggregation per- 
spective are 1) it produces a self-funded auction, 
2) it creates more liquidity by allowing multi- 
lateral order matching, 3) the prices generated 
satisfy "price consistency constraints", that is, 
the market organizer agrees to accept the orders 
with a limit price greater than the calculated 
price of the order while rejecting any order with 
a lower limit price. The price consistency con- 
straints ensure the traders that their orders are 
being duly considered by the market organizer, 
and provide incentive for informed traders to 
trade whenever their information would change 
the price. Furthermore, it is valuable that the 
model has a unique optimum and produces a 
unique price vector. 

Although the above model has many powerful 
properties, its call auction setting suffers from 
the drawback of a delayed decision. The traders 
are not sure about the acceptance of their or- 
ders until after the market is closed. Also, it is 
difficult to determine the optimal bidding strat- 
egy for the traders and ensure truthfulness. In 

³ The Kullback-Leibler distance (K-L distance) is a
measure of the difference between two probability distributions.
The K-L distance of a distribution $p$ from $\theta$ is
given by $\sum_i \theta_i \log(\theta_i / p_i)$.
a subsequent work, Peters et al. [la] introduced
a "Sequential Convex Parimutuel Mechanism
(SCPM)" which is an extension of the
CPCAM model to a dynamic setting, and has 
additional properties of immediate decision and 
truthfulness in a myopic sense. The techniques 
discussed in this paper assume a call auction set- 
ting, but can be directly applied to this sequen- 
tial extension. 

3 Permutation Betting Mechanisms

In this section, we propose new mechanisms for 
betting on permutations under the parimutuel 
call auction model described above. Consider a 
permutation betting scenario with n candidates. 
Traders bet on rankings of the candidates in the 
final outcome. The final outcome is represented 
by an $n \times n$ permutation matrix, where the $ij$ entry
of the matrix is 1 if candidate $i$ takes position
$j$ in the final outcome and 0 otherwise. We
propose betting mechanisms that restrict the admissible
bet types to 'set of candidate-position
pairs'. Thus, trader $k$'s bet will be specified
by an $n \times n$ (0, 1) matrix $A_k$, with 1 in the entries
corresponding to the candidate-position pairs he 
is bidding on. We will refer to this matrix as the 
'bidding matrix' of the trader. If the trader's bid 
is accepted, he will receive some payout in the 
event that his bid is a "winning bid".

Depending on how this payout is determined, 
two variations of this mechanism are examined: 
a) Fixed Reward Betting and b) Proportional 
Betting. The intractability of fixed reward bet- 
ting will provide motivation to examine propor- 
tional betting more closely, which is the focus of 
this paper. 

Fixed reward betting In this mechanism, 
a trader receives a fixed payout (assume $1 
w.l.o.g.) if any entry in his bidding matrix 
matches with the corresponding entry in the 
outcome permutation matrix. That is, if $M$ is
the outcome permutation matrix, then the payout
made to trader $k$ is given by $I(A_k \bullet M > 0)$.
Here, the operator '$\bullet$' denotes the Frobenius
inner product⁴, and $I(\cdot)$ denotes an indicator
function. The market organizer must decide
which bids to accept in order to maximize the 
worst case profit. Using the same notations as 
in the CPCAM model described in Section 2
for limit price, limit quantities, and accepted 
orders, the problem for the market organizer in 
this mechanism can be formulated as follows: 

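$$
\begin{aligned}
\max \quad & \pi^T x - r \\
\text{s.t.} \quad & r \ge \sum_{k=1}^{m} I(A_k \bullet M_\sigma > 0)\, x_k \quad \forall \sigma \in S_n \\
& 0 \le x \le q
\end{aligned}
\tag{2}
$$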

Here, $S_n$ represents the set of $n$-dimensional permutations,
and $M_\sigma$ represents the permutation matrix
corresponding to permutation $\sigma$. Note that
mizing the worst-case profit of the organizer with 
no starting orders. 
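
For intuition, the worst-case payout $r$ in (2) can be evaluated by enumerating all $n!$ permutations when $n$ is tiny. The sketch below is illustrative only: the function names, the encoding of a bid as a set of 0-indexed (candidate, position) pairs, and the sample numbers are assumptions, and enumeration is exactly the exponential blow-up that the discussion here is about.

```python
from itertools import permutations

def fixed_reward_worst_case(bids, x, n):
    """Largest total payout over all outcomes under fixed reward betting.

    bids: list of sets of (candidate, position) pairs (0-indexed), one per trader.
    x:    accepted quantity for each trader.
    """
    worst = 0.0
    for sigma in permutations(range(n)):  # sigma[i] = position taken by candidate i
        payout = sum(xk for pairs, xk in zip(bids, x)
                     if any(sigma[i] == j for i, j in pairs))  # indicator I(A_k . M > 0)
        worst = max(worst, payout)
    return worst

def organizer_profit(bids, x, limit_prices, n):
    """Objective of (2): revenue at limit prices minus the worst-case payout."""
    revenue = sum(price * xk for price, xk in zip(limit_prices, x))
    return revenue - fixed_reward_worst_case(bids, x, n)

bids = [{(0, 0), (1, 1)}, {(0, 1)}, {(2, 2)}]
x = [1.0, 1.0, 1.0]
print(fixed_reward_worst_case(bids, x, 3))            # 2.0
print(organizer_profit(bids, x, [0.6, 0.5, 0.4], 3))  # approximately -0.5
```

In this toy instance no single outcome satisfies all three bidders, so the worst-case payout is 2 rather than 3.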

The above is a linear program with an exponential
number of constraints. We prove the following
theorem regarding the complexity of solving this 
linear program. 

Theorem 3.1. The optimization problem in (2)
is NP-hard even for the case when there are only
two non-zero entries in each bidding matrix. 

Proof. The separation problem for the linear
program in (2) corresponds to finding the permutation
that "satisfies" the maximum number of
bidders. Here, an outcome permutation is said
to "satisfy" a bidder, if his bidding matrix has at 
least one coincident entry with the permutation 
matrix. We show that the separation problem is 
NP-hard using a reduction from the maximum satisfiability
(MAX-SAT) problem. In this reduction,
the clauses in the MAX-SAT instance will
be mapped to bidders in the bidding problem.
And, the number of non-zero entries in a bidding
matrix will be equal to the number of variables
in the corresponding clause. Since MAX-2-SAT
is NP-hard, this reduction will prove the NP- 
hardness even for the case when each bidding 
matrix is restricted to have only two non-zero 

⁴ The Frobenius inner product, denoted as $A \bullet B$ in
this paper, is the component-wise inner product of two
matrices as though they are vectors. That is,

$$ A \bullet B = \sum_{i,j} A_{ij} B_{ij} $$
entries. See the appendix for the complete reduction.

Using the result on equivalence of separation 
and optimization problem from 0] , the theorem 
follows. □ 

This result motivates us to examine the fol- 
lowing variation of this mechanism which makes 
payouts proportional to the number of winning 
entries in the bidding matrix. 

Proportional betting In this mechanism, 
the trader receives a fixed payout (assume $1 
w.l.o.g.) for each coincident entry between the 
bidding matrix $A_k$ and the outcome permutation
matrix. Thus, the payoff of a trader is
given by the Frobenius inner product of his
bidding matrix and the outcome permutation 
matrix. The problem for the market organizer 
in this mechanism can be formulated as follows: 

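$$
\begin{aligned}
\max \quad & \pi^T x - r \\
\text{s.t.} \quad & r \ge \sum_{k=1}^{m} (A_k \bullet M_\sigma)\, x_k \quad \forall \sigma \in S_n \\
& 0 \le x \le q
\end{aligned}
\tag{3}
$$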

The above linear program involves an exponential
number of constraints. However, the separation
problem for this program is polynomial-time
solvable, since it corresponds to finding the
maximum weight matching in a complete bipartite
graph, where weights of the edges are given
by elements of the matrix $\sum_k A_k x_k$. Thus,
the ellipsoid method with this separating ora- 
cle would give a polynomial-time algorithm for 
solving this problem. This approach is similar to 
the algorithm proposed in [5] for Subset Betting. 
Indeed, for the case of subset betting the two 
mechanisms proposed here are equivalent. This 
is because subset betting can be equivalently formulated
under our framework as a mechanism
that allows non-zero entries only on a single row
or column of the bidding matrix $A_k$. Hence, the
number of entries that are coincident with the
outcome permutation matrix can be either 0 or
1, resulting in $I(A_k \bullet M_\sigma > 0) = A_k \bullet M_\sigma$, for
all permutations $\sigma$. Thus, subset betting forms
a special case of the proportional betting mech- 
anism proposed here, and all the techniques de- 
rived in the sequel for proportional betting will 
directly apply to it. 
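
The separation problem described above can be sketched as follows. This is an illustrative brute-force version for very small $n$, not the polynomial-time oracle itself: it enumerates permutations instead of calling a maximum weight bipartite matching routine, and the function names and sample bids are assumptions.

```python
from itertools import permutations

def aggregate_weights(bid_matrices, x):
    """Edge weights W = sum_k x_k * A_k of the complete bipartite graph."""
    n = len(bid_matrices[0])
    return [[sum(xk * A[i][j] for A, xk in zip(bid_matrices, x))
             for j in range(n)] for i in range(n)]

def separation_oracle(bid_matrices, x, r, tol=1e-9):
    """Return (sigma, payout) for a most violated constraint of (3), or (None, payout).

    sigma[i] is the position assigned to candidate i. Enumeration is used purely
    for illustration; a real oracle would solve an assignment problem instead.
    """
    W = aggregate_weights(bid_matrices, x)
    n = len(W)
    best_sigma, best_value = None, float("-inf")
    for sigma in permutations(range(n)):
        value = sum(W[i][sigma[i]] for i in range(n))
        if value > best_value:
            best_sigma, best_value = sigma, value
    if best_value > r + tol:
        return best_sigma, best_value   # constraint r >= payout(sigma) is violated
    return None, best_value             # (x, r) satisfies every constraint

A1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
A2 = [[0, 1, 0], [0, 0, 0], [0, 0, 1]]
print(separation_oracle([A1, A2], [1.0, 0.5], r=2.0))  # ((0, 1, 2), 2.5)
print(separation_oracle([A1, A2], [1.0, 0.5], r=2.5))  # (None, 2.5)
```

Because the proportional payoff is linear in the permutation matrix, the most violated constraint is a maximum weight matching, which is what makes the oracle tractable for this mechanism but not for fixed reward betting.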



4 Pricing in Proportional Betting

In this section, we reformulate the market organizer's
problem for Proportional Betting into a
compact linear program involving only $O(n^2 + m)$
constraints. Not only is the new formulation
faster to solve in practice (using interior point
methods), but it will also generate a compact
dual price vector of size $n^2$. These 'marginal
prices' will be sufficient to price the bets in Proportional
Betting, and are shown to satisfy some
useful properties. The reformulation will also
allow introducing $n^2$ starting orders in order to
obtain unique prices.

Observe that the first constraint in (3)
implicitly sets $r$ as the worst case payoff over
all possible permutations (or matchings). Since
the matching polytope is integral [9], $r$ can
be equivalently set as the result of the following
linear program that computes the maximum weight
matching:

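$$
\begin{aligned}
r = \max \quad & \Big( \sum_{k=1}^{m} x_k A_k \Big) \bullet M \\
\text{s.t.} \quad & M^T e = e \\
& M e = e \\
& M_{ij} \ge 0, \quad 1 \le i, j \le n
\end{aligned}
$$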

Here $e$ denotes the (column) vector of all 1s.
Taking the dual, equivalently,

$$
\begin{aligned}
r = \min_{v,w} \quad & e^T v + e^T w \\
\text{s.t.} \quad & v_i + w_j \ge \sum_{k=1}^{m} (x_k A_k)_{ij} \quad \forall i, j
\end{aligned}
$$

Here, $(x_k A_k)_{ij}$ denotes the $ij$th element of the
matrix $(x_k A_k)$. The market organizer's problem
in (3) can now be formulated as:

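$$
\begin{aligned}
\max_{x,v,w} \quad & \pi^T x - e^T v - e^T w \\
\text{s.t.} \quad & v_i + w_j \ge \sum_{k=1}^{m} (x_k A_k)_{ij} \quad \forall i, j \\
& 0 \le x \le q
\end{aligned}
\tag{4}
$$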

Observe that this problem involves only $n^2 + 2m$
constraints. As we show later, the $n^2$ dual variables
for the first set of constraints can be well
interpreted as marginal prices. However, the
dual solutions for this problem are not guaranteed
to be unique. To ensure uniqueness, we can
use starting orders as discussed for the CPCAM
model in Section 2. After introducing one starting
order $\theta_{ij} > 0$ for each candidate-position
pair, and slack variables $s_{ij}$ for each of the $n^2$
constraints, we get the following problem:



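$$
\begin{aligned}
\max_{x,v,w,s} \quad & \pi^T x - e^T v - e^T w + \sum_{i,j} \theta_{ij} \log(s_{ij}) \\
\text{s.t.} \quad & v_i + w_j - s_{ij} = \sum_{k=1}^{m} (x_k A_k)_{ij} \quad \forall i, j \\
& s_{ij} \ge 0 \quad \forall i, j \\
& 0 \le x \le q
\end{aligned}
\tag{5}
$$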

and its dual: 



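$$
\begin{aligned}
\min_{y,Q} \quad & q^T y - \sum_{i,j} \theta_{ij} \log(Q_{ij}) \\
\text{s.t.} \quad & Q e = e \\
& Q^T e = e \\
& A_k \bullet Q + y_k \ge \pi_k \quad \forall k \\
& y \ge 0
\end{aligned}
\tag{6}
$$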

Next, we will show that models (5) and (6)
possess many desirable characteristics.

Lemma 4.1. Models (5) and (6) are convex programs.
Moreover, if $\theta_{ij} > 0$ for all $i, j$, the solution to (6)
is unique in $Q$.

Proof. Since the logarithmic function is concave and
the constraints are linear, we can easily verify
that (5) and (6) are convex programs. Also, according
to our assumption on $\theta$, the objective
function in (6) is strictly convex in $Q$. Thus, the
optimal solution of (6) must be unique in $Q$. □

Therefore, we know that this program can be 
solved up to any given accuracy in polynomial 
time using convex programming methods and 
produces a unique dual solution $Q$.

We show that the dual matrix $Q$ generated
from (5) is well interpreted as a "parimutuel
price". That is, $Q \ge 0$; and, if we charge each
trader $k$ a price of $A_k \bullet Q$ instead of their limit
price ($\pi_k$), then the optimal decision remains unchanged
and the total premium paid by the accepted
orders will be equal to the total payout
made in the worst case. Further, we will show
that $Q$ satisfies the following extended definition
of the "price consistency condition" introduced
in [17]:



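Definition 4.2. The price matrix $Q$ satisfies the
price consistency constraints if and only if for all $j$:

$$
\begin{aligned}
x_j = 0 \quad &\Rightarrow \quad Q \bullet A_j = c_j \ge \pi_j \\
0 < x_j < q_j \quad &\Rightarrow \quad Q \bullet A_j = c_j = \pi_j \\
x_j = q_j \quad &\Rightarrow \quad Q \bullet A_j = c_j \le \pi_j
\end{aligned}
$$

That is, a trader's bid is accepted only if his limit
price is no lower than the calculated price for the
order.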
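
The three cases of the definition translate directly into a check that can be run against any candidate price matrix. The sketch below is illustrative: the function names, the tolerance handling, and the uniform price matrix used in the example are assumptions rather than anything prescribed by the model.

```python
def frobenius(A, B):
    """Frobenius inner product: sum over i, j of A[i][j] * B[i][j]."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def is_price_consistent(Q, bids, x, q, limit_prices, tol=1e-9):
    """Check the price consistency constraints for every order j."""
    for A, xj, qj, limit in zip(bids, x, q, limit_prices):
        price = frobenius(Q, A)            # c_j = Q . A_j
        if xj <= tol:                      # rejected order: price at or above the limit
            ok = price >= limit - tol
        elif xj >= qj - tol:               # fully accepted: price at or below the limit
            ok = price <= limit + tol
        else:                              # partially accepted: price equals the limit
            ok = abs(price - limit) <= tol
        if not ok:
            return False
    return True

Q_uniform = [[1 / 3] * 3 for _ in range(3)]      # hypothetical marginal prices
bid = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]          # two candidate-position pairs, price 2/3

print(is_price_consistent(Q_uniform, [bid], x=[1.0], q=[1.0], limit_prices=[0.8]))  # True
print(is_price_consistent(Q_uniform, [bid], x=[0.0], q=[1.0], limit_prices=[0.5]))  # True
print(is_price_consistent(Q_uniform, [bid], x=[0.0], q=[1.0], limit_prices=[0.8]))  # False
```

The last call fails because an order whose limit price exceeds its computed price of 2/3 has been rejected, which is exactly what the constraints rule out.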

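To see this, we construct the Lagrangian function
for program (5):

$$
\begin{aligned}
L(x, Q, s, v, w, y) = {} & \pi^T x - e^T v - e^T w + \sum_{i,j} \theta_{ij} \log s_{ij} \\
& - \sum_{i,j} Q_{ij} \Big( \sum_k (x_k A_k)_{ij} + s_{ij} - v_i - w_j \Big) + \sum_k y_k (q_k - x_k)
\end{aligned}
$$

Now, we can derive the KKT conditions:

$$
\begin{aligned}
\pi_k - Q \bullet A_k - y_k &\le 0, & 1 \le k \le m \\
x_k \cdot (\pi_k - Q \bullet A_k - y_k) &= 0, & 1 \le k \le m \\
Q e &= e \\
Q^T e &= e \\
\theta_{ij}/s_{ij} - Q_{ij} &\le 0, & 1 \le i, j \le n \\
s_{ij} \cdot (\theta_{ij}/s_{ij} - Q_{ij}) &= 0, & 1 \le i, j \le n \\
y_k \cdot (x_k - q_k) &= 0, & 1 \le k \le m \\
y &\ge 0
\end{aligned}
$$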



Since $s_{ij} > 0$ for any optimal solution, the
above conditions imply that $Q_{ij} = \theta_{ij}/s_{ij}$, or $s_{ij} = \theta_{ij}/Q_{ij}$,
for all $i, j$. Since $\theta_{ij} > 0$, this implies $Q_{ij} > 0$
for all $i, j$. Also, the first constraint in the primal
problem (5) now gives $v_i + w_j = \sum_k (x_k A_k)_{ij} + \theta_{ij}/Q_{ij}$.
Multiplying by $Q_{ij}$, and summing over all $i, j$:

$$ r = e^T v + e^T w = \sum_k x_k (A_k \bullet Q) + \sum_{i,j} \theta_{ij} $$

Since $r$ gives the worst case payoff, charging the
bidders according to price matrix $Q$ results in a
parimutuel market (except for the amount invested
in the starting orders, an issue that we
address later). Also, if we replace $\pi_k$ with $A_k \bullet Q$
in the above KKT conditions and set $y_k = 0$,
the solution $x, s, Q$ will still satisfy all the KKT
conditions. Thus, the optimal solution remains 
unchanged. Further, observe that the first two 
conditions along with the penultimate one are 
exactly the price consistency constraints. Hence, 
Q must satisfy the price consistency constraints. 
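
The "multiply by $Q_{ij}$ and sum" step behind the parimutuel identity can be spelled out as follows. This is a worked sketch using only the conditions $Qe = e$, $Q^T e = e$ and $s_{ij} = \theta_{ij}/Q_{ij}$ stated above; no new assumptions are introduced beyond writing $(A_k)_{ij}$ for the entries of $A_k$.

```latex
\[
\begin{aligned}
\sum_{i,j} Q_{ij}\,(v_i + w_j)
  &= \sum_i v_i \sum_j Q_{ij} + \sum_j w_j \sum_i Q_{ij}
   = e^T v + e^T w , \\
\sum_{i,j} Q_{ij} \sum_k x_k (A_k)_{ij}
  &= \sum_k x_k \,(A_k \bullet Q) , \\
\sum_{i,j} Q_{ij}\, \frac{\theta_{ij}}{Q_{ij}}
  &= \sum_{i,j} \theta_{ij} .
\end{aligned}
\]
```

Equating the first line with the sum of the second and third gives the displayed expression for $r$: the worst-case payout equals the premiums collected at prices $A_k \bullet Q$ plus the total of the starting orders.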

In the above model, the market organizer needs to
seed the market with the starting orders $\theta_{ij}$ in
order to ensure uniqueness of the optimum state
price matrix. The market organizer could actually
lose this seed money in some outcomes. In
practice, we can set the starting orders to be very
small so that this is not an issue. On the other
hand, it is natural to ask whether the starting or- 
ders can be removed altogether from the model 
to make the market absolutely parimutuel. The 
following lemma shows that this is indeed possi- 
ble. 

Lemma 4.3. For any given starting orders $\theta$,
as we reduce $\theta$ uniformly to 0, the price matrix
converges to a unique limit $Q$, and this limit is
an optimal dual price for the model without the
starting orders given in (4).

Proof. The proof of this lemma follows directly
from the discussion in Section 3.1 of [17]. □

Moreover, as discussed in [13], such a limit
$Q$ can be computed efficiently using the path-following
algorithm developed in [21].

To summarize, we have shown that: 

Theorem 4.4. One can compute, in polynomial
time, an $n \times n$ marginal price matrix $Q$ which
is sufficient to price the bets in the Proportional
Betting mechanism. Further, the price matrix
is unique, parimutuel, and satisfies the desired
price-consistency constraints.

5 Pricing the Outcome Permutations

There is analytical as well as empirical evidence 
that prediction market prices provide useful estimates
of average beliefs about the probability
that an event occurs [H, [HI, [H, H3]. Therefore,
prices associated with contracts are typically
treated as predictions of the probability of
future events. The marginal price matrix Q de- 
rived in the previous section associates a price to 
each candidate-position pair. Also, observe that 
$Q$ is a doubly-stochastic matrix (refer to the constraints
in dual problem (6)). Thus, the distributions
given by a row (column) of $Q$ could be interpreted
as the marginal distribution over positions
for a given candidate (candidates for a given position).
One would like to compute the complete
price vector that assigns a price to each of the $n!$
outcome permutations. This price vector would 
provide information regarding the joint proba- 
bility distribution over the entire outcome space. 
In this section, we discuss methods for comput- 
ing this complete price vector from the marginal 
prices given by Q. 



Let $p_\sigma$ denote the price for permutation $\sigma$.
Then, the constraints on the price vector $p$ are
represented as:

$$
\begin{aligned}
\sum_{\sigma \in S_n} p_\sigma M_\sigma &= Q \\
p_\sigma &\ge 0 \quad \forall \sigma \in S_n
\end{aligned}
\tag{7}
$$

Note that the above constraints implicitly impose
the constraint $\sum_\sigma p_\sigma = 1$. Thus, $\{p_\sigma\}$ is
a valid distribution. Also, it is easy to establish
that if $Q$ is an optimal marginal price matrix,
then any such $\{p_\sigma\}$ is an optimal joint price vector
over permutations. That is,

Lemma 5.1. If $Q$ is an optimal dual solution
for (4), then any price vector $\{p_\sigma\}$ that satisfies
the constraints in (7) is an optimal dual solution
for (3).

Proof. The result follows directly from the struc- 
ture of the two dual problems. See appendix for 
a detailed proof. □ 

Finding a feasible solution under these constraints
is equivalent to finding a decomposition
of the doubly-stochastic matrix $Q$ into a convex
combination of $n \times n$ permutation matrices.
There are multiple such decompositions possible.
For example, one such solution can be obtained
using the Birkhoff-von Neumann decomposition
@,0|. Next, we propose a criterion to choose
a meaningful distribution $p$ from the set of distributions
satisfying the constraints in (7).
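
A Birkhoff-von Neumann style decomposition can be sketched with a simple greedy loop. The code below is an illustration under stated assumptions: the input is assumed to be exactly doubly stochastic (hence the use of `Fraction`), the function name is hypothetical, and a permutation inside the support is found by enumeration, which is only sensible for small $n$ (a practical version would use a bipartite perfect matching routine).

```python
from fractions import Fraction
from itertools import permutations

def birkhoff_decomposition(Q):
    """Greedy decomposition of a doubly stochastic Q into weighted permutations.

    Returns a list of (weight, sigma) with sigma[i] the position of candidate i,
    such that the sum of weight * M_sigma equals Q.
    """
    n = len(Q)
    R = [[Fraction(v) for v in row] for row in Q]   # residual matrix
    terms = []
    while any(v > 0 for row in R for v in row):
        # Birkhoff's theorem guarantees a permutation inside the support of R.
        sigma = next(s for s in permutations(range(n))
                     if all(R[i][s[i]] > 0 for i in range(n)))
        weight = min(R[i][sigma[i]] for i in range(n))
        for i in range(n):
            R[i][sigma[i]] -= weight                # zeroes at least one entry
        terms.append((weight, sigma))
    return terms

half = Fraction(1, 2)
Q = [[half, half, 0],
     [half, 0, half],
     [0, half, half]]
terms = birkhoff_decomposition(Q)
print(terms)  # [(Fraction(1, 2), (0, 2, 1)), (Fraction(1, 2), (1, 0, 2))]
```

The decomposition found this way depends on the order in which permutations are tried, which illustrates why an additional selection criterion is needed to single out one distribution.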

5.1 Maximum entropy criterion 

Intuitively, we would like to use all the informa- 
tion about the marginal distributions that we 
have, but avoid including any information that 
we do not have. This intuition is captured by the 
'Principle of Maximum Entropy'. It states that 
the least biased distribution that encodes cer- 
tain given information is that which maximizes 
the information entropy. 

Therefore, we consider the problem of find- 
ing the maximum entropy distribution over the 
space of n dimensional permutations, satisfying 
the above constraints on the marginal distribu- 
tions. The problem can be represented as fol- 
lows: 

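$$
\begin{aligned}
\min \quad & \sum_{\sigma \in S_n} p_\sigma \log p_\sigma \\
\text{s.t.} \quad & \sum_{\sigma \in S_n} p_\sigma M_\sigma = Q \\
& p_\sigma \ge 0
\end{aligned}
\tag{8}
$$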

The maximum entropy distribution obtained 
from above has many nice properties. Firstly, 
as we show next, the distribution has a concise
representation in terms of only $n^2$ parameters.
This property is crucial for combinatorial
betting due to the exponential state space over
which the distribution is defined. Let $Y \in R^{n \times n}$
be the Lagrangian dual variable corresponding
to the marginal distribution constraints in (8),
and $s_\sigma$ be the dual variables corresponding to
the non-negativity constraints on $p_\sigma$. Then, the
KKT conditions for (8) are given by:

$$
\begin{aligned}
\log(p_\sigma) + 1 - s_\sigma &= Y \bullet M_\sigma & \forall \sigma \\
s_\sigma,\; p_\sigma &\ge 0 & \forall \sigma \\
p_\sigma s_\sigma &= 0 & \forall \sigma
\end{aligned}
$$

Assuming $p_\sigma > 0$ for all $\sigma$, this gives $p_\sigma = e^{Y \bullet M_\sigma - 1}$.
Thus, the distribution is completely
specified by the $n^2$ parameters given by $Y$. Once
Y is known, it is possible to perform operations 
like computing the probability for a given set of 
outcomes, or sampling the highly probable out- 
comes. 

Further, we show that the dual solution Y is 
a maximum likelihood estimator of distribution 
parameters under suitable interpretation of Q. 

Maximum likelihood interpretation For a
fixed set of data and an assumed underlying
probability model, the maximum likelihood estimation
method picks the values of the model parameters
that make the data "more likely" than
any other values of the parameters would make
them. Let us assume in our model that the
traders' beliefs about the outcome come from
an exponential family of distributions $D_\eta$, with
probability density function of the form $f_\eta \propto
e^{\eta \bullet M_\sigma}$ for some parameter $\eta \in R^{n \times n}$. Suppose
$Q$ gives a summary statistic of $s$ sample observations
$\{M^1, M^2, \ldots, M^s\}$ from the traders'
beliefs, i.e., $Q = \frac{1}{s} \sum_k M^k$. This assumption
is in line with the interpretation of the prices in
prediction markets as the mean belief of the traders.

Then, the maximum likelihood estimator $\hat{\eta}$ of
$\eta$ is the value that maximizes the likelihood of
these observations, that is:

$$
\begin{aligned}
\hat{\eta} &= \arg\max_\eta \log f_\eta(M^1, M^2, \ldots, M^s) \\
&= \arg\max_\eta \log \Big( \prod_k \frac{e^{\eta \bullet M^k}}{Z} \Big)
\end{aligned}
$$

The optimality conditions for the above unconstrained
convex program are:

$$ \frac{1}{Z} \sum_\sigma e^{\eta \bullet M_\sigma} M_\sigma = \frac{1}{s} \sum_k M^k \tag{9} $$

where $Z$ is the normalizing constant, $Z = \sum_\sigma e^{\eta \bullet M_\sigma}$.
Since $\frac{1}{s} \sum_k M^k = Q$, observe from
the KKT conditions for the maximum entropy
model given in (8) that $\eta = Y$ satisfies the above
optimality conditions. Hence, the parameter $Y$
computed from the maximum entropy model is
also the maximum likelihood estimator for the
model parameters $\eta$.

5.2 Complexity of the Maximum Entropy Model

In this section, we analyze the complexity of
solving the maximum entropy model in (8). As
shown in the previous section, the solution to
this model is given by the parametric distribution
$p_\sigma = e^{Y \bullet M_\sigma - 1}$. The parameters $Y$ are the
dual variables given by the optimal solution to
the following dual problem of (8):

$$ \max_Y \quad Q \bullet Y - \sum_\sigma e^{Y \bullet M_\sigma - 1} \tag{10} $$
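
For very small $n$ the dual problem can be solved directly by enumerating permutations, which gives a useful sanity check on the parametric form. The sketch below uses plain gradient ascent on the dual objective; the function name, step size, iteration count and sample matrix `Q` are illustrative assumptions, and this brute-force approach is not the approximation algorithm developed later in the paper.

```python
import math
from itertools import permutations

def maxent_dual_ascent(Q, steps=3000, lr=0.1):
    """Gradient ascent on  Q.Y - sum_sigma exp(Y.M_sigma - 1)  by enumeration.

    Returns (Y, p) where p maps each permutation sigma (sigma[i] = position of
    candidate i) to its probability exp(Y.M_sigma - 1).
    """
    n = len(Q)
    perms = list(permutations(range(n)))
    Y = [[0.0] * n for _ in range(n)]

    def distribution():
        return [math.exp(sum(Y[i][s[i]] for i in range(n)) - 1.0) for s in perms]

    for _ in range(steps):
        p = distribution()
        # Gradient: Q minus the marginals currently implied by p.
        G = [row[:] for row in Q]
        for p_sigma, s in zip(p, perms):
            for i in range(n):
                G[i][s[i]] -= p_sigma
        for i in range(n):
            for j in range(n):
                Y[i][j] += lr * G[i][j]
    return Y, dict(zip(perms, distribution()))

Q = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]
Y, p = maxent_dual_ascent(Q)
marginals = [[sum(v for s, v in p.items() if s[i] == j) for j in range(3)]
             for i in range(3)]
print(round(sum(p.values()), 6))                       # 1.0
print([[round(v, 4) for v in row] for row in marginals])
```

At convergence the marginals of the fitted distribution reproduce `Q`, and the probabilities sum to one even though normalization is never imposed explicitly.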

We prove the following result regarding the com- 
plexity of computing the parameters Y: 

Theorem 5.2. It is #P-hard to compute the
parameters of the maximum entropy distribution
$\{p_\sigma\}$ over $n$-dimensional permutations $\sigma \in S_n$
that has a given marginal distribution.

Proof. We make a reduction from the following 
problem: 

Permanent of a (0, 1) matrix The permanent
of an $n \times n$ matrix $B$ is defined as $\mathrm{perm}(B) =
\sum_{\sigma \in S_n} \prod_{i=1}^{n} B_{i,\sigma(i)}$. Computing the permanent of a
(0, 1) matrix is #P-hard |lg|. 

We use the observation that $\sum_\sigma e^{Y \bullet M_\sigma} =
\mathrm{perm}(e^Y)$, where the notation $e^Y$ is used to mean
component-wise exponentiation: $(e^Y)_{ij} = e^{Y_{ij}}$.
For the complete proof, see the appendix. □
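
The identity between the normalizing sum and the permanent can be verified numerically for small matrices. The sketch below compares a direct enumeration of the sum over permutations with Ryser's inclusion-exclusion formula for the permanent; Ryser's formula is brought in here only as an independent way of computing the permanent, and the function names and the sample matrix `Y` are assumptions.

```python
import math
from itertools import combinations, permutations

def partition_sum(Y):
    """Sum over all permutations sigma of exp(Y . M_sigma), by enumeration."""
    n = len(Y)
    return sum(math.exp(sum(Y[i][s[i]] for i in range(n)))
               for s in permutations(range(n)))

def permanent_ryser(B):
    """Permanent via Ryser's formula: sum over non-empty column subsets S of
    (-1)^(n - |S|) * prod_i sum_{j in S} B[i][j]."""
    n = len(B)
    total = 0.0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            row_products = math.prod(sum(B[i][j] for j in cols) for i in range(n))
            total += (-1) ** (n - size) * row_products
    return total

Y = [[0.1, -0.3, 0.2],
     [0.0, 0.5, -0.1],
     [-0.2, 0.4, 0.3]]
expY = [[math.exp(v) for v in row] for row in Y]   # component-wise exponentiation

print(abs(partition_sum(Y) - permanent_ryser(expY)) < 1e-9)  # True
print(permanent_ryser([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))    # 6.0
```

Both routines still take exponential time, which is consistent with the hardness result; the point is only that evaluating the normalizer is a permanent computation.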

Interestingly, there exists an FPTAS based on
MCMC methods for computing the permanent
of any non-negative matrix. Next, we derive
a polynomial-time algorithm for approximately
computing the parameter $Y$ that uses this FPTAS
along with the ellipsoid method for optimization.

5.3 An Approximation Algorithm 

In this section, we develop an approximation algorithm
to compute the parameters $Y$. We first
relax the formulation in (8) to get an equivalent
problem that will lead to a better bounded dual.
Consider the problem below:

$$
\begin{aligned}
\min \quad & \sum_\sigma p_\sigma (\log p_\sigma - 1) \\
\text{s.t.} \quad & \sum_\sigma p_\sigma M_\sigma \le Q \\
& p_\sigma \ge 0
\end{aligned}
\tag{11}
$$



We prove the following lemma: 

Lemma 5.3. The problem in (11) has the same optimal solution as (8).

Proof. See the appendix. □ 



The Lagrangian dual of this problem is given by:

$$\begin{array}{ll} \max & Q \cdot Y - \sum_\sigma e^{Y \cdot M_\sigma} \\ \text{s.t.} & Y \le 0 \end{array} \tag{12}$$

Note that Y is bounded from above. Next, we 
establish lower bounds on the variable Y. These 
bounds will be useful in proving polynomial run- 
ning time for ellipsoid method. 

Lemma 5.4. The optimal value OPT and the optimal solution $Y$ to (12) satisfy the following bounds:

$$0 \ge \mathrm{OPT} \ge -n \log n - 1$$
$$0 \ge Y_{ij} \ge -n \log n / q_{\min} \quad \forall i, j$$

Here, $q_{\min} = \min_{i,j} \{Q_{ij}\}$.

Proof. See the appendix. □



Remark: Note that if $Q_{ij} = 0$ for any $i, j$, in a pre-processing step we could set the corresponding $Y_{ij}$ to $-\infty$ and remove it from the problem. So, w.l.o.g. we can assume $q_{\min} > 0$. However, some $Q_{ij}$ could be very small, making the above bounds very large. One way to handle this would be to set very small $Q_{ij}$'s (say less than $\delta$ for some small $\delta > 0$) to 0, and remove the corresponding $Y_{ij}$ from the problem. This would introduce a small additive approximation of $\delta$ in the constraints of the problem, but ensure that $q_{\min} > \delta$.

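From KKT conditions for the above problem, we obtain that $p_\sigma = e^{Y \cdot M_\sigma}$ at optimality. Substituting $p_\sigma$ into the primal constraints $\sum_\sigma p_\sigma = 1$ and $\sum_\sigma p_\sigma M_\sigma \le Q$, we can obtain the following equivalent dual problem with additional constraints:

$$\begin{array}{lll} \max & Q \cdot Y - 1 & \\ \text{s.t.} & \sum_\sigma e^{Y \cdot M_\sigma} M_\sigma \le Q & \\ & Y_{ij} \ge (-n \log n)/q_{\min} & \forall i, j \\ & Y_{ij} \le 0 & \forall i, j \end{array} \tag{13}$$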



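The problem can be equivalently formulated as that of finding a feasible point in the convex body $K$ defined as:

$$K: \quad \begin{array}{ll} Q \cdot Y - 1 \ge t & \\ \sum_\sigma e^{Y \cdot M_\sigma} M_\sigma \le Q & \\ Y_{ij} \ge (-n \log n)/q_{\min} & \forall i, j \\ Y_{ij} \le 0 & \forall i, j \end{array}$$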



Here, $t$ is a fixed parameter. An optimal solution to (13) can be found by binary search on $t \in [-n \log n - 1, 0]$. We define an approximate set $K_\epsilon$ by modifying the RHS of the second constraint in $K$ defined above to $Q(1 + \epsilon)$. Here, $\epsilon$ is a fixed parameter.

Next, we show that the ellipsoid method can be used to generate a $(1 + \epsilon)$-approximate solution $Y$. We will make use of the following lemma that bounds the gradient of the convex function $f(Y) = \sum_\sigma e^{Y \cdot M_\sigma} M_\sigma$ appearing in the constraints of the problem.

Lemma 5.5. For any $i, j$, the gradient of the function $g(Y) = f_{ij}(Y) = \sum_\sigma e^{Y \cdot M_\sigma} (M_\sigma)_{ij}$ satisfies the following bounds:

$$\|\nabla g(Y)\|_2 \le n\, g(Y) \le n \|\nabla g(Y)\|_2$$

Proof. See the appendix. □



Now, we can obtain an approximate separat- 
ing oracle for the ellipsoid method. 

Lemma 5.6. Given any $Y \in R^{n \times n}$, and any parameter $\epsilon > 0$, there exists an algorithm with running time polynomial in $n$, $1/\epsilon$ and $1/q_{\min}$ that does one of the following:

• asserts that $Y \in K_\epsilon$

• or, finds $C \in R^{n \times n}$ such that $C \cdot X \le C \cdot Y$ for every $X \in K$.

Algorithm 

1. If $Y$ violates any constraints other than the constraint on $f(Y)$, report $Y \notin K$. The violated inequality gives the separating hyperplane.

2. Otherwise, compute a $(1 \pm \delta)$-approximation $\hat{f}(Y)$ of $f(Y)$, where $\delta = \min\{\epsilon/12, 1\}$.

(a) If $\hat{f}(Y) \le (1 + 3\delta) Q$, then report $Y \in K_\epsilon$.

(b) Otherwise, say the $ij$-th constraint is violated. Compute a $(1 \pm \gamma)$-approximation of the gradient of the function $g(Y) = f_{ij}(Y)$, where $\gamma = \delta q_{\min} / 2n^4$. The approximate gradient $C = \hat{\nabla} g(Y)$ gives the desired separating hyperplane.
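The two steps of the algorithm can be sketched in code for very small $n$. This is only an illustration under a strong assumption: exact brute-force enumeration stands in for the approximate permanent computation, so there is no approximation error and the running time is exponential. The function names (`f`, `grad_f_ij`, `unit`, `separation_oracle`) and the test inputs are hypothetical.

```python
import itertools
import math

def f(Y):
    """f(Y)[i][j] = sum of exp(Y . M_s) over permutations s with s(i) = j."""
    n = len(Y)
    F = [[0.0] * n for _ in range(n)]
    for s in itertools.permutations(range(n)):
        w = math.exp(sum(Y[i][s[i]] for i in range(n)))
        for i in range(n):
            F[i][s[i]] += w
    return F

def grad_f_ij(Y, i, j):
    """Exact gradient of f_ij: sum of exp(Y . M_s) * M_s over s with s(i) = j."""
    n = len(Y)
    G = [[0.0] * n for _ in range(n)]
    for s in itertools.permutations(range(n)):
        if s[i] == j:
            w = math.exp(sum(Y[k][s[k]] for k in range(n)))
            for k in range(n):
                G[k][s[k]] += w
    return G

def unit(n, i, j, sign):
    """Matrix with a single non-zero entry, used as a cut for box constraints."""
    C = [[0.0] * n for _ in range(n)]
    C[i][j] = sign
    return C

def separation_oracle(Y, Q, t, eps):
    """Return ("inside", None), or ("cut", C) with C . X <= C . Y for all X in K."""
    n = len(Y)
    q_min = min(min(row) for row in Q)
    lower = -n * math.log(n) / q_min
    delta = min(eps / 12.0, 1.0)
    # Step 1: box constraints on Y and the objective constraint Q . Y - 1 >= t.
    for i in range(n):
        for j in range(n):
            if Y[i][j] > 0:
                return "cut", unit(n, i, j, 1.0)
            if Y[i][j] < lower:
                return "cut", unit(n, i, j, -1.0)
    if sum(Q[i][j] * Y[i][j] for i in range(n) for j in range(n)) - 1 < t:
        return "cut", [[-q for q in row] for row in Q]
    # Step 2: compare f(Y) with (1 + 3*delta) Q entry by entry.
    F = f(Y)
    for i in range(n):
        for j in range(n):
            if F[i][j] > (1 + 3 * delta) * Q[i][j]:
                return "cut", grad_f_ij(Y, i, j)
    return "inside", None

n = 3
Q = [[1.0 / n] * n for _ in range(n)]               # uniform marginals
Y_opt = [[-math.log(6) / n] * n for _ in range(n)]  # gives p_s = 1/6 for every s
Y_bad = [[0.0] * n for _ in range(n)]               # violates the f(Y) constraint
status_opt, _ = separation_oracle(Y_opt, Q, t=-3.0, eps=0.1)
status_bad, C = separation_oracle(Y_bad, Q, t=-3.0, eps=0.1)
print(status_opt, status_bad)
```

In the actual algorithm, both `f` and `grad_f_ij` would be replaced by approximate permanent evaluations, which is where the tolerances $\delta$ and $\gamma$ enter.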

Running time: Observe that $f_{ij}(Y) = e^{Y_{ij}} \, \mathrm{perm}(e^{Y^{ij}})$, where $Y^{ij}$ denotes the matrix obtained from $Y$ after removing row $i$ and column $j$. Thus, a $(1 \pm \delta)$-approximation to $f(Y)$ can be obtained in time polynomial in $n$ and $1/\delta$ using the FPTAS given in [12] for computing the permanent of a non-negative matrix. Since $1/\delta$ is polynomial in $n$, $1/\epsilon$ and $1/q_{\min}$, this gives polynomial running time for estimating $f(Y)$. Similar observations hold for estimating the gradient $\nabla f_{ij}(Y)$ above.
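The reduction of each entry of $f(Y)$ to a single permanent of a minor can be verified directly on a small example. The sketch below assumes a brute-force permanent in place of the approximation scheme; `f_ij_by_enumeration`, `f_ij_by_minor` and the sample `Y` are illustrative names and data.

```python
import itertools
import math

def permanent(B):
    """Brute-force permanent (tiny matrices only)."""
    n = len(B)
    return sum(
        math.prod(B[i][s[i]] for i in range(n))
        for s in itertools.permutations(range(n))
    )

def f_ij_by_enumeration(Y, i, j):
    """Sum of exp(Y . M_s) over all permutations s with s(i) = j."""
    n = len(Y)
    return sum(
        math.exp(sum(Y[k][s[k]] for k in range(n)))
        for s in itertools.permutations(range(n))
        if s[i] == j
    )

def f_ij_by_minor(Y, i, j):
    """exp(Y_ij) times the permanent of e^Y with row i and column j removed."""
    n = len(Y)
    minor = [[math.exp(Y[r][c]) for c in range(n) if c != j]
             for r in range(n) if r != i]
    return math.exp(Y[i][j]) * permanent(minor)

Y = [[-0.3, -1.2, -0.7],
     [-2.0, -0.1, -0.9],
     [-0.5, -1.5, -0.4]]
n = len(Y)
max_gap = max(
    abs(f_ij_by_enumeration(Y, i, j) - f_ij_by_minor(Y, i, j))
    for i in range(n) for j in range(n)
)
print(max_gap)
```

Evaluating all $n^2$ entries this way costs $n^2$ permanent computations of size $(n-1) \times (n-1)$.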

Correctness The correctness of the above al- 
gorithm is established by the following two lem- 
mas: 

Lemma 5.7. If $\hat{f}(Y) \le (1 + 3\delta) Q$ and all the other constraints are satisfied, then $Y \in K_\epsilon$.

Proof. See the appendix. □ 

Lemma 5.8. Suppose the $ij$-th constraint is violated, i.e., $\hat{f}_{ij}(Y) > (1 + 3\delta) Q_{ij}$. Then, $C = \hat{\nabla} f_{ij}(Y)$ gives a separating hyperplane for $K$, that is, $C \cdot (X - Y) \le 0, \ \forall X \in K$.

Proof. See the appendix. The proof uses the bounds on $X$, $Y$ and $\nabla f_{ij}(Y)$ established in Lemma 5.4 and Lemma 5.5 respectively. □

Theorem 5.9. Using the separating oracle given by Lemma 5.6 with the ellipsoid method, a distribution $\{p_\sigma\}$ over permutations can be constructed in time $\mathrm{poly}(n, 1/\epsilon, 1/q_{\min})$, such that

• $(1 - \epsilon) Q \le \sum_\sigma p_\sigma M_\sigma \le Q$

• $p$ has close to maximum entropy, i.e., $\sum_\sigma p_\sigma \log p_\sigma \le (1 - \epsilon)\, \mathrm{OPT}_E$, where $\mathrm{OPT}_E\ (\le 0)$ is the optimal value of (8).

Proof. Using the above separating oracle with the ellipsoid method, after a polynomial number of iterations we will either get a solution $Y \in K_\epsilon(t)$, or declare that there is no feasible solution. Thus, by binary search over $t$, we can get a solution $Y$ such that $Y \in K_\epsilon$ and $Q \cdot Y - 1 \ge \mathrm{OPT}$. The dual solution thus obtained will have an objective value equal to or better than optimal but may be infeasible. We reduce each of the $Y_{ij}$'s by a small amount ($\frac{1}{n} \log(1 + \epsilon)$) to construct a new feasible but sub-optimal solution $\hat{Y}$. Some simple algebraic manipulations show that the new solution $\hat{Y}$ satisfies $(1 - \epsilon) Q \le \sum_\sigma e^{\hat{Y} \cdot M_\sigma} M_\sigma \le Q$. Thus, $\hat{Y}$ is a feasible solution to the dual, and $Q \cdot \hat{Y} - 1 \le \mathrm{OPT}$. We can now construct the distribution $p_\sigma$ as $p_\sigma = e^{\hat{Y} \cdot M_\sigma}$. Then from above, $(1 - \epsilon) Q \le \sum_\sigma p_\sigma M_\sigma \le Q$. Also,

$$\begin{aligned} \sum_\sigma p_\sigma \log p_\sigma &= \Big( \sum_\sigma e^{\hat{Y} \cdot M_\sigma} M_\sigma \Big) \cdot \hat{Y} \\ &\le (1 - \epsilon)\, Q \cdot \hat{Y} \\ &\le (1 - \epsilon)(\mathrm{OPT} + 1) \\ &= (1 - \epsilon)\, \mathrm{OPT}_E \end{aligned}$$

□
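The effect of uniformly lowering the entries of $Y$ can be illustrated numerically: subtracting a constant $c$ from every entry lowers each $Y \cdot M_\sigma$ by $nc$, so every entry of $\sum_\sigma e^{Y \cdot M_\sigma} M_\sigma$ is scaled by $e^{-nc}$. The sketch below assumes the shift $c = \log(1 + \epsilon)/n$, which gives a scaling of exactly $1/(1 + \epsilon)$; the helper `f`, the matrix `Y` and the value of `eps` are illustrative.

```python
import itertools
import math

def f(Y):
    """f(Y)[i][j] = sum of exp(Y . M_s) over permutations s with s(i) = j."""
    n = len(Y)
    F = [[0.0] * n for _ in range(n)]
    for s in itertools.permutations(range(n)):
        w = math.exp(sum(Y[i][s[i]] for i in range(n)))
        for i in range(n):
            F[i][s[i]] += w
    return F

eps = 0.1
Y = [[-0.2, -1.0, -0.6],
     [-0.9, -0.3, -1.1],
     [-0.7, -0.8, -0.1]]
n = len(Y)

# Assumed shift: log(1 + eps) / n subtracted from every entry of Y.
shift = math.log(1 + eps) / n
Y_hat = [[y - shift for y in row] for row in Y]

F, F_hat = f(Y), f(Y_hat)
max_err = max(
    abs(F_hat[i][j] - F[i][j] / (1 + eps))
    for i in range(n) for j in range(n)
)
print(max_err)
```

A point that violates the marginal constraint by at most a factor $(1 + \epsilon)$ therefore becomes feasible after the shift.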

6 Conclusion 

We introduced a Proportional Betting mecha- 
nism for permutation betting which can be read- 
ily implemented by solving a convex program of 
polynomial size. More importantly, the mecha- 
nism was shown to admit an efficient parimutuel 
pricing scheme, wherein only $n^2$ marginal prices
were needed to price the bets. Further, we 
demonstrated that these marginal prices can be 
used to construct meaningful joint distributions 
over the exponential outcome space. 

The proposed proportional betting mecha- 
nism was developed by relaxing a 'fixed reward 
betting mechanism'. An interesting question 
raised by this work is whether the fixed reward betting mechanism could provide further information about the outcome distribution. Or, in general, how does the complexity of the betting language relate to the information collected from the market? A positive answer to this question would justify exploring approximation algorithms for the more complex fixed reward betting mechanism.

Acknowledgements We thank Arash Asad- 
discussions.
APPENDIX

Proof of Theorem 3.1 Consider the complete bipartite graph with the $n$ candidates in one set and the $n$ positions in the other set. In our betting mechanism, each bidder $k$ bids on a subset of edges in this graph which is given by the non-zero entries in his bidding matrix $A_k$. A bidder is "satisfied" by a matching (or permutation) in this graph if at least one of the edges he bid on occurs in the matching. The separation problem for the linear program in (2) corresponds to finding the matching that satisfies the maximum number of bidders. Thus, it can be equivalently stated as the following matching problem:

Matching problem: Given a complete bipartite graph $K_{n,n} = (V_1, V_2, E)$ and a collection $C = \{E_1, E_2, \ldots, E_m\}$ of $m$ subsets of $E$, find the perfect matching $M \subseteq E$ that intersects the maximum number of subsets in $C$.

MAX-SAT problem: Given a boolean formula 
in CNF form, determine an assignment of {0, 1} 
to the variables in the formula that satisfies the 
maximum number of clauses. 

Reduction from MAX-SAT to our matching problem: Given the boolean formula in the MAX-SAT problem with $n$ variables $x, y, z, \ldots$, construct a complete bipartite graph $K_{2n,2n}$ as follows. For each variable $x$, add two nodes $x$ and $x'$ to the graph. For the possible values 0 and 1 of $x$, construct two nodes $x_0$ and $x_1$. Connect by edges all the nodes corresponding to the variables to all the nodes corresponding to the values. Now, create the collection $C$ as follows. For the $k$-th clause in the boolean formula, create a set $E_k$ in $C$. For each negated variable $x$ in this clause, add the edge $(x, x_0)$ to $E_k$; and for each non-negated variable $x$ in the clause, add the edge $(x, x_1)$ to $E_k$.

We show that every solution of size $l$ for the MAX-SAT instance corresponds to a solution of size $l$ for the constructed matching problem instance and vice versa. Suppose there is an assignment that satisfies $l$ clauses of the MAX-SAT instance. Output a matching $M$ in the constructed graph as follows. For each variable $x$, consider the nodes $x, x', x_0, x_1$. If the variable $x$ is assigned value 0 in the MAX-SAT solution, add edges $(x, x_0), (x', x_1)$ to $M$. Otherwise, add edges $(x, x_1), (x', x_0)$ to $M$. It is easy to see that the resulting set $M$ is a matching. Also, if a clause $k$ is satisfied in the MAX-SAT problem, then the matching $M$ will have an edge in common with $E_k$. Therefore $M$ intersects at least $l$ subsets in $C$.

Similarly, consider a solution $M$ to the matching problem. Form a solution to the MAX-SAT problem as follows. Suppose the set $E_k$ is satisfied (intersects $M$). Then, one of the edges in $E_k$ must be present in $M$. If $(x, x_0)$ (respectively $(x, x_1)$) is such an edge, assign 0 (respectively 1) to $x$. Because $M$ is a matching, any node $x$ has at most one edge in $M$ incident on it, so $(x, x_0)$ and $(x, x_1)$ cannot both be present in $M$. This ensures that $x$ takes at most one value, 0 or 1, in the constructed assignment. For the remaining variables, assign values arbitrarily. By construction, if a set $E_k$ is satisfied in the matching solution, the corresponding $k$-th clause must be satisfied in the MAX-SAT problem, resulting in a solution of size at least $l$ to MAX-SAT. This completes the reduction.

Note that, in the above, if we reduced from MAX-2-SAT, then each subset $E_k$ would contain exactly two edges, that is, we would get an instance in which each bidder bids on exactly two candidate-position pairs. Because MAX-2-SAT is NP-hard, this proves that this special case is also NP-hard.
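The reduction can be exercised on a toy formula by brute force. The sketch below is an illustration only: it assumes a hypothetical encoding in which a clause is a list of `(variable index, negated)` pairs, left node `2v` stands for $x$, left node `2v+1` for $x'$, right node `2v` for $x_0$ and right node `2v+1` for $x_1$. The sample formula is made up for the example.

```python
import itertools

def max_sat(num_vars, clauses):
    """Maximum number of clauses satisfied by any 0/1 assignment (brute force)."""
    best = 0
    for bits in itertools.product([0, 1], repeat=num_vars):
        sat = sum(
            any(bits[v] == (0 if neg else 1) for v, neg in clause)
            for clause in clauses
        )
        best = max(best, sat)
    return best

def edge_sets(clauses):
    """One edge set E_k per clause: (x, x_0) if negated, (x, x_1) otherwise."""
    return [
        {(2 * v, 2 * v if neg else 2 * v + 1) for v, neg in clause}
        for clause in clauses
    ]

def max_matching_value(num_vars, sets):
    """Maximum number of edge sets hit by a perfect matching of K_{2n,2n}."""
    size = 2 * num_vars
    best = 0
    for s in itertools.permutations(range(size)):
        matching = {(left, s[left]) for left in range(size)}
        best = max(best, sum(1 for E in sets if E & matching))
    return best

# Toy formula: (x) and (not x) and (x or y) and (not y or z) and (not z).
clauses = [
    [(0, False)],
    [(0, True)],
    [(0, False), (1, False)],
    [(1, True), (2, False)],
    [(2, True)],
]
sat_value = max_sat(3, clauses)
matching_value = max_matching_value(3, edge_sets(clauses))
print(sat_value, matching_value)
```

The enumeration covers all $6! = 720$ perfect matchings here, so it is usable only as a sanity check of the construction, not as an algorithm.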

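Proof of Lemma 5.1 The dual for (3) is:

$$\begin{array}{lll} \min_{y, Q} & q^T y & \\ \text{s.t.} & A_k \bullet Q + y_k \ge \pi_k & \forall k \\ & Q e = e & \\ & Q^T e = e & \\ & y \ge 0 & \\ & Q \ge 0 & \end{array} \tag{14}$$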

The dual for © is: 

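$$\begin{array}{lll} \min_{y, p} & q^T y & \\ \text{s.t.} & \sum_\sigma (A_k \bullet M_\sigma)\, p_\sigma + y_k \ge \pi_k & \forall k \\ & \sum_\sigma p_\sigma = 1 & \\ & y \ge 0 & \end{array} \tag{15}$$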

Suppose $p'_\sigma$ is a solution to (15), and $\sum_\sigma p'_\sigma M_\sigma = Q'$; then the first constraint in (15) is equivalent to $A_k \bullet Q' + y_k \ge \pi_k$. Hence, for any solution $p'_\sigma$ to (15), there is a corresponding solution $Q' = \sum_\sigma p'_\sigma M_\sigma$ to (14) with the same objective value. Thus, if $Q$ is an optimal solution to (14), then all $p_\sigma$ satisfying $\sum_\sigma p_\sigma M_\sigma = Q$ have the same objective value and are optimal.
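The key step of this argument is linearity: a bid matrix interacts with a distribution over permutations only through its marginal matrix. A minimal numerical check follows, assuming a made-up distribution `p` over $S_3$ and a hypothetical bid matrix `A`; none of these values come from the paper.

```python
import itertools

def perm_matrix(s):
    """Permutation matrix M_s with a 1 in position (i, s(i))."""
    n = len(s)
    return [[1 if s[i] == j else 0 for j in range(n)] for i in range(n)]

def inner(A, B):
    """Entry-wise (Frobenius) inner product of two matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n = 3
perms = list(itertools.permutations(range(n)))
p = [0.30, 0.05, 0.10, 0.20, 0.15, 0.20]  # illustrative distribution over S_3

# Marginal matrix Q = sum_s p_s M_s.
Q = [[sum(p[k] * perm_matrix(s)[i][j] for k, s in enumerate(perms))
      for j in range(n)] for i in range(n)]

A = [[1, 0, 1],
     [0, 1, 0],
     [0, 0, 1]]  # hypothetical bid on four candidate-rank pairs

lhs = inner(A, Q)
rhs = sum(p[k] * inner(A, perm_matrix(s)) for k, s in enumerate(perms))
row_sums = [sum(row) for row in Q]
print(lhs, rhs, row_sums)
```

Any two distributions with the same `Q` give the same value of `lhs`, so they are indistinguishable to the constraints.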

Proof of Theorem 5.2 The optimality condition for the dual problem in (10) is specified as (setting derivative to 0):

$$Q = \sum_\sigma e^{Y \cdot M_\sigma - 1} M_\sigma \tag{16}$$

Thus, given a certificate $Y$, verifying its optimality requires computing the function $f(Y) = \sum_\sigma e^{Y \cdot M_\sigma} M_\sigma$ for a given $Y$. Note that the $ij$-th component of this function is given by

$$\sum_{\sigma : j = \sigma(i)} e^{Y \cdot M_\sigma} = e^{Y_{ij}} \, \mathrm{perm}(e^{Y'})$$

where $Y'$ is the matrix obtained from $Y$ after removing row $i$ and column $j$. We show that computing the permanent of $e^{Y'}$ is #P-hard by reducing the problem of computing the permanent of a (0, 1) matrix to it. The reduction uses the
technique from the proof of Theorem 1 in 0]. 
We repeat the construction below for completeness. Suppose $A$ is an $(n-1) \times (n-1)$ (0, 1) matrix whose permanent we wish to find. Then, construct a matrix $Y'$ as follows:

$$Y'_{kl} = \begin{cases} \log(n! + 2) & \text{if } A_{kl} = 1 \\ \log(n! + 1) & \text{if } A_{kl} = 0 \end{cases}$$

Then, $\mathrm{perm}(e^{Y'}) \bmod (n! + 1) = \mathrm{perm}(A) \bmod (n! + 1) = \mathrm{perm}(A)$, since $\mathrm{perm}(A) \le n!$. Hence, even the verification problem for this optimization problem is at least as hard as computing the permanent of a (0, 1)-matrix.
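The modular trick in this construction can be checked with exact integer arithmetic, working directly with the entries of $e^{Y'}$ (that is, $n! + 2$ and $n! + 1$) rather than with their logarithms. The sketch assumes a brute-force permanent; the sample matrix `A` is illustrative.

```python
import itertools
import math

def permanent(B):
    """Brute-force permanent; exact for integer matrices."""
    n = len(B)
    return sum(
        math.prod(B[i][s[i]] for i in range(n))
        for s in itertools.permutations(range(n))
    )

A = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]            # illustrative (0, 1) matrix of size (n-1) x (n-1)
n = len(A) + 1
modulus = math.factorial(n) + 1

# Entries of e^{Y'}: n! + 2 where A is 1, and n! + 1 where A is 0.
B = [[modulus + 1 if a == 1 else modulus for a in row] for row in A]

perm_A = permanent(A)
perm_B_mod = permanent(B) % modulus
print(perm_A, perm_B_mod)
```

Each entry of `B` is congruent to the matching entry of `A` modulo $n! + 1$, so the two permanents agree modulo $n! + 1$, and the bound $\mathrm{perm}(A) \le n!$ removes the ambiguity.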

Proof of Lemma 5.3 Observe that the problem in (8) involves the implicit constraint $\sum_\sigma p_\sigma = 1$. Further, we show that the equality constraints can be relaxed to inequality. We will show that it is impossible that $\sum_\sigma p_\sigma M_\sigma < Q$ for some elements in the optimal solution. Observe that the matrix $Q - \sum_\sigma p_\sigma M_\sigma$ has the property that each row and each column sums up to $1 - \sum_\sigma p_\sigma$. That is, $(Q - \sum_\sigma p_\sigma M_\sigma)/(1 - \sum_\sigma p_\sigma)$ is a doubly-stochastic matrix. The Birkhoff-von Neumann theorem [2] proves that any doubly stochastic matrix can be represented as a convex combination of permutation matrices. Since $Q - \sum_\sigma p_\sigma M_\sigma \ge 0$, there must be at least one strictly positive coefficient in the Birkhoff-von Neumann decomposition of this matrix. This means that we can increase at least one $p_\sigma$ a little bit without violating the inequality constraint. However, the derivative of the objective w.r.t. one variable is $\log p_\sigma$. Therefore, when $p_\sigma < 1$, increasing $p_\sigma$ will always decrease the objective value, which contradicts the assumption that we have already reached the optimum. Thus we have shown that the problem in (11) shares the same optimal solution as (8).

Proof of Lemma 5.4 Note that $Y = -\log n \times \mathrm{ones}(n, n)$ forms a feasible solution to (12). Hence, the optimal value of the dual must be at least $-n \log n - 1$, that is, $0 \ge \mathrm{OPT} \ge -n \log n - 1$. Also, from KKT conditions, the optimal solutions to the primal and dual are related as $p_\sigma = e^{Y \cdot M_\sigma}$. Hence, as discussed in the proof of Lemma 5.3 for the primal solution, the optimal dual solution must satisfy $\sum_\sigma e^{Y \cdot M_\sigma} M_\sigma = Q$, implicitly leading to $\sum_\sigma e^{Y \cdot M_\sigma} = 1$ at optimality. Along with the lower bound on OPT, this gives $Q \cdot Y \ge -n \log n$. Since $Y \le 0$ and every $Q_{ij} \ge q_{\min}$, each entry satisfies $q_{\min} Y_{ij} \ge Q_{ij} Y_{ij} \ge Q \cdot Y$, which implies $Y_{ij} \ge -n \log n / q_{\min}$.
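The value of the dual objective at the constant point used in this proof can be evaluated directly. The sketch below assumes the dual objective $Q \cdot Y - \sum_\sigma e^{Y \cdot M_\sigma}$ and an illustrative doubly stochastic matrix `Q`; at $Y = -\log n \cdot \mathrm{ones}(n, n)$ the objective works out to $-n \log n - n!/n^n$.

```python
import itertools
import math

def dual_objective(Q, Y):
    """Q . Y minus the sum over permutations s of exp(Y . M_s)."""
    n = len(Y)
    linear = sum(Q[i][j] * Y[i][j] for i in range(n) for j in range(n))
    penalty = sum(
        math.exp(sum(Y[i][s[i]] for i in range(n)))
        for s in itertools.permutations(range(n))
    )
    return linear - penalty

n = 4
# Illustrative doubly stochastic matrix (circulant, rows and columns sum to 1).
Q = [[0.4, 0.3, 0.2, 0.1],
     [0.1, 0.4, 0.3, 0.2],
     [0.2, 0.1, 0.4, 0.3],
     [0.3, 0.2, 0.1, 0.4]]
Y = [[-math.log(n)] * n for _ in range(n)]

value = dual_objective(Q, Y)
closed_form = -n * math.log(n) - math.factorial(n) / n ** n
lower_bound = -n * math.log(n) - 1
print(value, closed_form, lower_bound)
```

Because $n! \le n^n$, the subtracted term is at most 1, which is where the $-1$ in the lower bound on OPT comes from.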

Proof of Lemma 5.5 The gradient of $g(Y)$ is $\nabla g(Y) = \sum_\sigma e^{Y \cdot M_\sigma} (M_\sigma)_{ij} M_\sigma$. That is, $\nabla g(Y)$ is an $n \times n$ matrix defined as:

$$\nabla g(Y)_{kl} = \begin{cases} \sum_{\sigma : j = \sigma(i)} e^{Y \cdot M_\sigma} & \text{if } (k, l) = (i, j) \\ \sum_{\sigma : j = \sigma(i),\, l = \sigma(k)} e^{Y \cdot M_\sigma} & \text{if } k \ne i \text{ and } l \ne j \\ 0 & \text{otherwise} \end{cases}$$

We will use the notation $e^Y$, where $Y$ is a matrix, to mean component-wise exponentiation: $(e^Y)_{ij} = e^{Y_{ij}}$. Let $\chi$ denote the permanent of the non-negative matrix $e^Y$. Denote by $\chi_{ij}$ the permanent of the submatrix obtained after removing row $i$ and column $j$ from $e^Y$. Then, observe that $g(Y) = e^{Y_{ij}} \cdot \chi_{ij}$. Also, the gradient of $g(Y)$ can be written as:

$$\nabla g(Y)_{kl} = \begin{cases} e^{Y_{ij}} \cdot \chi_{ij} & \text{if } (k, l) = (i, j) \\ e^{Y_{ij}} e^{Y_{kl}} \cdot \chi_{ij,kl} & \text{if } k \ne i \text{ and } l \ne j \\ 0 & \text{otherwise} \end{cases}$$

where $\chi_{ij,kl}$ denotes the permanent of the matrix obtained after removing rows $i, k$ and columns $j, l$ from $e^Y$.

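Using the relation between the permanent of a matrix and its submatrices, observe that:

$$\begin{aligned} \|\nabla g(Y)\|_1 &= e^{Y_{ij}} \cdot \chi_{ij} + e^{Y_{ij}} \sum_{k \ne i,\, l \ne j} e^{Y_{kl}} \cdot \chi_{ij,kl} \\ &= n\, e^{Y_{ij}} \chi_{ij} \\ &= n\, g(Y) \end{aligned}$$

Hence,

$$\|\nabla g(Y)\|_2 \le n\, g(Y) \le n \|\nabla g(Y)\|_2$$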
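The norm identity can be checked on a small instance. The sketch below assumes exact enumeration of the gradient; it verifies that the 1-norm equals $n\, g(Y)$ and then applies the standard inequalities $\|x\|_2 \le \|x\|_1 \le n \|x\|_2$ for a vector with $n^2$ entries. The function names and the sample `Y` are illustrative.

```python
import itertools
import math

def g_and_gradient(Y, i, j):
    """Return g(Y) = f_ij(Y) and its exact gradient, by enumerating permutations."""
    n = len(Y)
    g = 0.0
    G = [[0.0] * n for _ in range(n)]
    for s in itertools.permutations(range(n)):
        if s[i] == j:
            w = math.exp(sum(Y[k][s[k]] for k in range(n)))
            g += w
            for k in range(n):
                G[k][s[k]] += w   # each such permutation adds w * M_s
    return g, G

Y = [[-0.3, -1.2, -0.7, -0.2],
     [-2.0, -0.1, -0.9, -1.4],
     [-0.5, -1.5, -0.4, -0.6],
     [-1.1, -0.8, -0.3, -1.0]]
n = len(Y)
g, G = g_and_gradient(Y, 1, 2)

one_norm = sum(abs(v) for row in G for v in row)
two_norm = math.sqrt(sum(v * v for row in G for v in row))
print(one_norm, n * g, two_norm)
```

Every permutation matrix has exactly $n$ ones, so each permutation contributes its weight $n$ times to the 1-norm, which gives the factor $n$.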
Proof of Lemma 5.7 For any such $Y$,

$$f(Y) \le \frac{\hat{f}(Y)}{1 - \delta} \le \frac{1 + 3\delta}{1 - \delta}\, Q \le (1 + 12\delta)\, Q \le (1 + \epsilon)\, Q$$

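Proof of Lemma 5.8 Suppose the $ij$-th constraint is violated. That is, $\hat{f}_{ij}(Y) > (1 + 3\delta) Q_{ij}$. This implies that $f_{ij}(Y) > (1 + \delta) Q_{ij}$. This is because if $f_{ij}(Y) \le (1 + \delta) Q_{ij}$, then $\hat{f}_{ij}(Y) \le (1 + \delta) f_{ij}(Y) \le (1 + \delta)^2 Q_{ij} \le (1 + 3\delta) Q_{ij}$, where the last step uses $\delta \le 1$.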

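Below, we denote the function $f_{ij}(Y)$ by $g(Y)$ and $Q_{ij}$ by $b$. Given any $X \in K$, since $g(\cdot)$ is a convex function,

$$\nabla g(Y)^T (X - Y) \le g(X) - g(Y) \le b - g(Y)$$

Therefore, using the bounds on $X$ and $Y$, which give $\|X - Y\| \le n^2 \log n / q_{\min}$,

$$\begin{aligned} \hat{\nabla} g(Y)^T (X - Y) &\le \nabla g(Y)^T (X - Y) + \|\hat{\nabla} g(Y) - \nabla g(Y)\| \cdot \|X - Y\| \\ &\le b - g(Y) + \gamma \|\nabla g(Y)\| \cdot \frac{n^2 \log n}{q_{\min}} \\ &\le b - g(Y) + \gamma \cdot n\, g(Y) \cdot \frac{n^2 \log n}{q_{\min}} \\ &\le b - b (1 + \delta) \left( 1 - \gamma \frac{n^3 \log n}{q_{\min}} \right) \end{aligned}$$

where the second last inequality follows from the bound on the gradient given by Lemma 5.5. The last inequality follows from the observation made earlier that $g(Y) = f_{ij}(Y) > (1 + \delta) b$. Now,

$$\gamma = \frac{\delta q_{\min}}{2 n^4} \le \frac{\delta}{1 + \delta} \cdot \frac{q_{\min}}{n^3 \log n}$$

Hence, from above,

$$\hat{\nabla} g(Y)^T (X - Y) \le 0$$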